feat: add per-user memory with LLM extraction, dedup, and decay#67

Merged
Koopa0 merged 1 commit into main from feat/memory-integration
Feb 17, 2026
Conversation

@Koopa0 Koopa0 commented Feb 17, 2026

  • PostgreSQL + pgvector native memory store (two-threshold dedup, time-decay)
  • LLM-powered fact extraction with nonce-delimited prompt injection defense
  • LLM conflict arbitration for near-duplicate memories

@Koopa0 Koopa0 merged commit 6a27145 into main Feb 17, 2026
7 of 8 checks passed
@Koopa0 Koopa0 deleted the feat/memory-integration branch February 17, 2026 17:00
Comment on lines +1 to 11
DROP TABLE IF EXISTS memories;
DROP TABLE IF EXISTS messages;

-- ============================================================================
-- Drop Sessions Table (including indexes)
-- ============================================================================

DROP INDEX IF EXISTS idx_sessions_owner_id;
DROP INDEX IF EXISTS idx_sessions_updated_at;
DROP TABLE IF EXISTS sessions;

-- ============================================================================
-- Drop Documents Table (including indexes)
-- ============================================================================

DROP INDEX IF EXISTS idx_documents_owner;
DROP INDEX IF EXISTS idx_documents_metadata_gin;
DROP INDEX IF EXISTS idx_documents_source_type;
DROP INDEX IF EXISTS idx_documents_embedding;
DROP TABLE IF EXISTS documents;

-- ============================================================================
-- Drop Extensions
-- Note: Only drop if no other schemas depend on it
-- ============================================================================

DROP EXTENSION IF EXISTS vector;


The explicit DROP INDEX statements are redundant: in PostgreSQL, DROP TABLE automatically removes the table's indexes, and the IF EXISTS clauses already guard against missing objects. The script is also inconsistent: indexes on 'sessions' and 'documents' are dropped explicitly, while those on 'memories' and 'messages' are not. Recommendation: either drop only the tables and let their indexes go with them, or handle every table the same way.

Comment on lines +1 to 11

The migration script does not wrap the DROP statements in a transaction. If an error occurs partway through, the database could be left in an inconsistent state. Recommendation: Wrap all statements in a transaction block (e.g., BEGIN; ... COMMIT;) to ensure atomicity.

CREATE TABLE IF NOT EXISTS sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    title TEXT,
    owner_id TEXT NOT NULL DEFAULT '',


Potential Data Integrity and Security Issue:

The owner_id column in the sessions table is defined as TEXT NOT NULL DEFAULT ''. This allows sessions to be created with an empty string as the owner, which may lead to orphaned sessions and complicate access control. It is recommended to:

  • Remove the default value and require explicit assignment of a valid owner.
  • Alternatively, use NULL to indicate no owner and enforce ownership at the application level.

Example:

owner_id TEXT NOT NULL

Or, if nullable:

owner_id TEXT

Comment on lines +99 to +100
CREATE UNIQUE INDEX idx_memories_owner_content_unique
ON memories(owner_id, md5(content)) WHERE active = true;


Deduplication Vulnerability via MD5 Hash Collisions:

The unique index keys on md5(content). An MD5 collision would make two distinct contents share a key, so the second insert would be wrongly rejected as a duplicate; accidental collisions are vanishingly unlikely, but MD5 is not collision-resistant against crafted inputs. Additionally, deduplication does not apply to inactive memories, which could result in inconsistent data. Consider a stronger hash function (e.g., SHA-256) or a direct unique constraint on the content column if performance and storage allow.

Example (SHA-256 via the digest() function, which requires the pgcrypto extension):

ON memories(owner_id, encode(digest(content, 'sha256'), 'hex')) WHERE active = true;
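If SHA-256 is adopted, application code can precompute the same key before insert. A minimal Go sketch (contentKey is a hypothetical helper, not part of this PR) mirroring the proposed index expression:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentKey mirrors the Postgres expression
// encode(digest(content, 'sha256'), 'hex'), so application code
// can pre-check duplicates with the same key the index enforces.
func contentKey(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(contentKey("likes hiking"))
}
```

Computing the key in both places keeps the database index and any in-process dedup cache in agreement.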

Comment on lines 58 to 66
//
// Shutdown order:
// 1. Cancel context (signals background tasks to stop)
// 2. Wait for background goroutines (scheduler) to exit
// 3. Close DB pool
// 4. Flush OTel spans
func (a *App) Close() error {
	a.closeOnce.Do(func() {
		slog.Info("shutting down application")


Resource cleanup error handling is missing in Close().

The Close method does not capture or propagate errors from dbCleanup or otelCleanup. If these cleanup functions fail, the error will be silently ignored, potentially masking critical shutdown failures. Consider aggregating errors from each cleanup step and returning a combined error:

var err error
if a.dbCleanup != nil {
    if cerr := a.dbCleanup(); cerr != nil {
        err = errors.Join(err, cerr)
    }
}
// ... same for otelCleanup
return err

Comment on lines +103 to +111
func FuzzContainsSecrets(f *testing.F) {
	f.Add("hello world")
	f.Add("sk-1234567890abcdefghijklmnop")
	f.Add("")
	f.Add("password=secret123456")
	f.Fuzz(func(_ *testing.T, input string) {
		ContainsSecrets(input) // must not panic
	})
}


Fuzz test does not validate correctness of output

The fuzz test for ContainsSecrets (lines 103-111) only checks that the function does not panic, but does not assert the correctness of its output. This could allow silent failures if the function starts returning incorrect results without panicking. To improve robustness, consider adding assertions for expected output on known inputs, or at least logging unexpected results for further analysis.

Example improvement:

f.Fuzz(func(t *testing.T, input string) {
    got := ContainsSecrets(input)
    if again := ContainsSecrets(input); again != got {
        t.Errorf("non-deterministic: ContainsSecrets(%q) = %v, then %v", input, got, again)
    }
})
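Beyond panics, a fuzz body can assert cheap invariants such as determinism and hits on the seed corpus. A standalone sketch with a stand-in containsSecrets (the prefixes here are assumptions for illustration, not the real detector):

```go
package main

import (
	"fmt"
	"strings"
)

// containsSecrets is a stand-in for the real detector; the patterns
// below are assumptions chosen to match the PR's seed corpus.
func containsSecrets(s string) bool {
	return strings.Contains(s, "sk-") || strings.Contains(s, "password=")
}

// checkDeterminism is the kind of invariant a fuzz body can assert:
// the same input must always produce the same answer.
func checkDeterminism(input string) error {
	if a, b := containsSecrets(input), containsSecrets(input); a != b {
		return fmt.Errorf("non-deterministic result for %q", input)
	}
	return nil
}

func main() {
	seeds := []string{"hello world", "sk-1234567890abcdefghijklmnop", "", "password=secret123456"}
	for _, s := range seeds {
		if err := checkDeterminism(s); err != nil {
			fmt.Println(err)
		}
	}
	fmt.Println("seed hit:", containsSecrets("sk-1234567890abcdefghijklmnop"))
}
```

Determinism checks are safe inside fuzzing because they hold for any input, unlike output oracles that only hold for known cases.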

Comment on lines +46 to +50
if n, err := s.store.UpdateDecayScores(ctx); err != nil {
	s.logger.Warn("decay update failed", "error", err)
} else if n > 0 {
	s.logger.Debug("decay scores updated", "count", n)
}


Error Handling Limitation:

If UpdateDecayScores fails, the error is only logged and not propagated or retried. This could result in persistent data inconsistencies if failures are not transient. Consider implementing a retry mechanism or escalating the error to ensure critical failures are not silently ignored.

Example:

for retries := 0; retries < maxRetries; retries++ {
    n, err := s.store.UpdateDecayScores(ctx)
    if err == nil {
        if n > 0 {
            s.logger.Debug("decay scores updated", "count", n)
        }
        break
    }
    s.logger.Warn("decay update failed", "error", err)
    time.Sleep(retryDelay)
}

Comment on lines +52 to +56
if n, err := s.store.DeleteStale(ctx); err != nil {
	s.logger.Warn("stale expiry failed", "error", err)
} else if n > 0 {
	s.logger.Info("expired stale memories", "count", n)
}


Error Handling Limitation:

Similar to the previous block, if DeleteStale fails, the error is only logged. This could lead to stale data persisting indefinitely. Consider adding a retry mechanism or alerting to ensure failures are addressed promptly.

Example:

for retries := 0; retries < maxRetries; retries++ {
    n, err := s.store.DeleteStale(ctx)
    if err == nil {
        if n > 0 {
            s.logger.Info("expired stale memories", "count", n)
        }
        break
    }
    s.logger.Warn("stale expiry failed", "error", err)
    time.Sleep(retryDelay)
}

Comment on lines +127 to +131
defer func() {
	if rbErr := tx.Rollback(ctx); rbErr != nil && !errors.Is(rbErr, pgx.ErrTxClosed) {
		s.logger.Debug("transaction rollback", "error", rbErr)
	}
}()


Deferred Rollback Always Executes After Commit

The deferred rollback in the Add method will always execute, even after a successful commit. This can result in unnecessary debug logs and may mask real rollback errors. To improve reliability and clarity, track whether the transaction has been committed and only attempt rollback if it has not:

committed := false
defer func() {
    if !committed {
        if rbErr := tx.Rollback(ctx); rbErr != nil && !errors.Is(rbErr, pgx.ErrTxClosed) {
            s.logger.Debug("transaction rollback", "error", rbErr)
        }
    }
}()
// ...
if err := tx.Commit(ctx); err != nil {
    return fmt.Errorf("committing memory transaction: %w", err)
}
committed = true
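The committed-flag pattern can be exercised without a database. A minimal sketch using a fakeTx stand-in for pgx.Tx:

```go
package main

import "fmt"

// fakeTx stands in for pgx.Tx to demonstrate the committed-flag pattern.
type fakeTx struct{ committed, rolledBack bool }

func (t *fakeTx) Commit() error   { t.committed = true; return nil }
func (t *fakeTx) Rollback() error { t.rolledBack = true; return nil }

// add registers the guarded rollback before doing any work, so every
// error path rolls back, but a successful commit skips the rollback.
func add(tx *fakeTx) (err error) {
	committed := false
	defer func() {
		if !committed {
			_ = tx.Rollback() // fires only when Commit was not reached
		}
	}()
	// ... transactional work would go here ...
	if err = tx.Commit(); err != nil {
		return fmt.Errorf("committing: %w", err)
	}
	committed = true
	return nil
}

func main() {
	tx := &fakeTx{}
	_ = add(tx)
	fmt.Println(tx.committed, tx.rolledBack) // committed without rollback
}
```

The key ordering detail: the deferred guard must be registered before Commit, and the flag is flipped only after Commit succeeds.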

Comment on lines +382 to +433
func (s *Store) evictIfNeeded(ctx context.Context, ownerID string) error {
	var count int
	if err := s.pool.QueryRow(ctx,
		`SELECT COUNT(*) FROM memories WHERE owner_id = $1 AND active = true`,
		ownerID,
	).Scan(&count); err != nil {
		return fmt.Errorf("counting memories: %w", err)
	}

	if count <= MaxPerUser {
		return nil
	}

	excess := count - MaxPerUser

	// First try to evict inactive memories.
	tag, err := s.pool.Exec(ctx,
		`DELETE FROM memories
		WHERE id IN (
			SELECT id FROM memories
			WHERE owner_id = $1 AND active = false
			ORDER BY updated_at ASC, id ASC
			LIMIT $2
		)`,
		ownerID, excess,
	)
	if err != nil {
		return fmt.Errorf("evicting inactive: %w", err)
	}

	remaining := excess - int(tag.RowsAffected())
	if remaining <= 0 {
		return nil
	}

	// Evict oldest active by created_at.
	_, err = s.pool.Exec(ctx,
		`DELETE FROM memories
		WHERE id IN (
			SELECT id FROM memories
			WHERE owner_id = $1 AND active = true
			ORDER BY created_at ASC, id ASC
			LIMIT $2
		)`,
		ownerID, remaining,
	)
	if err != nil {
		return fmt.Errorf("evicting oldest active: %w", err)
	}

	return nil
}


Eviction Failure Handling May Allow Resource Exhaustion

evictIfNeeded itself propagates errors, but if its caller merely logs the returned error as a warning, eviction can fail silently and users could exceed MaxPerUser, potentially leading to resource exhaustion. Consider whether eviction failures should be handled more strictly, such as retrying or surfacing the error to the caller, depending on the application's requirements.
